Mapping queries to the Linking Open Data cloud: A case study using DBpedia

Authors

  • Edgar Meij
  • Marc Bron
  • Laura Hollink
  • Bouke Huurnink
  • Maarten de Rijke
Abstract

We introduce the task of mapping search engine queries to DBpedia, a major linking hub in the Linking Open Data cloud. We propose and compare various methods for addressing this task, using a mixture of information retrieval and machine learning techniques. Specifically, we present a supervised machine learning-based method to determine which concepts are intended by a user issuing a query. The concepts are obtained from an ontology and may be used to provide contextual information, related concepts, or navigational suggestions to the user submitting the query. Our approach first ranks candidate concepts using a language modeling for information retrieval framework. We then extract query, concept, and search-history feature vectors for these concepts. Using manual annotations we inform a machine learning algorithm that learns how to select concepts from the candidates given an input query. Simply performing a lexical match between the queries and concepts is found to perform poorly and so does using retrieval alone, i.e., omitting the concept selection stage. Our proposed method significantly improves upon these baselines and we find that support vector machines are able to achieve the best performance out of the machine learning algorithms evaluated.

ISSN 1570-8268. doi:10.1016/j.websem.2011.04.001. © 2011 Elsevier B.V. All rights reserved.
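The two-stage idea in the abstract (rank candidate concepts with a language model, then extract features for a learned selection step) can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the authors' implementation: the toy concept texts are invented, Dirichlet smoothing is only one common choice of language-model smoothing, and the two features are illustrative stand-ins for the query, concept, and search-history features the abstract mentions. The selection classifier itself (a support vector machine in the paper) is not shown.

```python
import math
from collections import Counter

# Toy concept texts (hypothetical; not real DBpedia data).
CONCEPTS = {
    "Amsterdam": "amsterdam capital city of the netherlands",
    "Netherlands": "netherlands country in western europe",
    "Rembrandt": "rembrandt dutch painter born in leiden",
}

def rank_concepts(query, concepts, mu=10.0):
    """Rank concepts by query log-likelihood with Dirichlet smoothing (assumed)."""
    docs = {name: Counter(text.split()) for name, text in concepts.items()}
    collection = Counter()
    for counts in docs.values():
        collection.update(counts)
    total = sum(collection.values())
    vocab = len(collection)
    scores = {}
    for name, counts in docs.items():
        length = sum(counts.values())
        score = 0.0
        for term in query.lower().split():
            # Add-one floor so terms unseen in the collection avoid log(0).
            p_coll = (collection[term] + 1) / (total + vocab + 1)
            score += math.log((counts[term] + mu * p_coll) / (length + mu))
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)

def features(query, name, concepts):
    """Two illustrative query-concept features for a later selection stage."""
    q_terms = set(query.lower().split())
    c_terms = set(concepts[name].split())
    return {
        "term_overlap": len(q_terms & c_terms) / max(len(q_terms), 1),
        "label_in_query": float(name.lower() in q_terms),
    }

ranking = rank_concepts("rembrandt painter", CONCEPTS)
vector = features("amsterdam hotels", "Amsterdam", CONCEPTS)
```

In a full pipeline along the lines the abstract describes, feature vectors like `vector` would be computed for the top-ranked candidates and passed, together with manual annotations, to a supervised classifier that decides which candidates to keep.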


Similar resources

Mapping WordNet Instances to Wikipedia

Lexical resources differ from encyclopaedic resources; the two represent distinct types of resource, covering general language and named entities respectively. However, many lexical resources, including Princeton WordNet, contain many proper nouns referring to named entities in the world, yet it is not possible or desirable for a lexical resource to cover all named entities that may reasonably occ...


Publishing Data from the Smithsonian American Art Museum as Linked Open Data

Museums around the world have built databases with metadata about millions of objects, the people who created them, and the entities they represent. This data is stored in proprietary databases and is not readily available for use. Recently, museums embraced the Semantic Web as a means to make this data available to the world, but the experience so far shows that publishing museum data to the L...


LOQUS: Linked Open Data SPARQL Querying System

The LOD cloud is gathering a lot of momentum, with the number of contributors growing manifold. Many prominent data providers have submitted their data and linked it to other datasets with the help of manual mappings. The potential of the LOD cloud is enormous, ranging from challenging AI issues such as open domain question answering to automated knowledge discovery. We believe that there is not eno...


Augmenting a Feature Set of Movies Using Linked Open Data

Augmenting a feature set using mappings to the Web of data is an up-and-coming way to enrich data in the original dataset. Those enrichments are especially valuable for recent preference learning algorithms and recommender systems. In this paper, we describe the process of mapping and augmenting the movie ratings dataset MovieTweetings from the perspective of the RecSysRules 2015 Challenge. The...


SESOS: A Verifiable Searchable Outsourcing Scheme for Ordered Structured Data in Cloud Computing

While cloud computing is growing at a remarkable speed, privacy issues are far from being solved. One way to diminish privacy concerns is to store data on the cloud in encrypted form. However, encryption often hinders useful computation cloud services. A theoretical approach is to employ the so-called fully homomorphic encryption, yet the overhead is so high that it is not considered a viable s...



Journal title:
  • J. Web Sem.

Volume 9


Publication date: 2011